{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 日夜图像分类器\n",
    "---\n",
    "\n",
    "日夜图像数据集由200个RGB彩色图像组成，分为两类：白天图像和夜晚图像。每个例子都有相同的数字：100个日图像和100个夜图像。\n",
    "\n",
    "我们希望建立一个分类器，可以把这些图像准确地标记为白天或黑夜。要完成这个任务，我们需要找出这两种图像之间的显著性特征！\n",
    "\n",
    "*注：所有图像都来自 [AMOS 数据集](http://cs.uky.edu/~jacobs/datasets/amos/) （众多户外场景档案）。*\n",
    "\n",
    "\n",
    "### 导入资源\n",
    "\n",
    "在开始使用项目代码之前，请导入你需要的库和资源。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2 # computer vision library\n",
    "import helpers\n",
    "\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.image as mpimg\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 训练并测试数据\n",
    "200张日/夜的图像被分成训练和测试数据集。\n",
    "\n",
    "* 这些图像中的60％是训练图像，供你在创建分类器时使用。\n",
    "* 另外40％是测试图像，将用于测试分类器的准确度。\n",
    "\n",
    "首先，我们设置一些变量来跟踪图像的存储位置："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "image_dir_training: the directory where our training image data is stored\n",
    "image_dir_test: the directory where our test image data is stored\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Image data directories\n",
    "image_dir_training = \"day_night_images/training/\"\n",
    "image_dir_test = \"day_night_images/test/\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 加载数据集\n",
    "\n",
    "前几行代码将加载训练日/夜图像，并将它们全部存储在变量`IMAGE_LIST`中。 该列表包含图像及其相关标签（“日”或“夜”）。\n",
    "\n",
    "例如， `IMAGE_LIST` 中的第一个图像标签对可以通过索引访问："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "attributes": {
     "classes": [
      "IMAGE_LIST[0][:]```."
     ],
     "id": ""
    }
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "\n",
    "```python\n",
    "# Using the load_dataset function in helpers.py\n",
    "# Load training data\n",
    "IMAGE_LIST = helpers.load_dataset(image_dir_training)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 构建输入图像和输出标签的`STANDARDIZED_LIST` 函数 \n",
    "\n",
    "该函数将输入一个图像标签对列表，并输出一个包含调整过的图像和数字标签的 **标准** 列表。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Standardize all training images\n",
    "STANDARDIZED_LIST = helpers.standardize(IMAGE_LIST)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 将标准化数据可视化\n",
    "\n",
    "显示一个来自STANDARDIZED_LIST的标准化图片。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Display a standardized image and its label\n",
    "\n",
    "# Select an image by index\n",
    "image_num = 0\n",
    "selected_image = STANDARDIZED_LIST[image_num][0]\n",
    "selected_label = STANDARDIZED_LIST[image_num][1]\n",
    "\n",
    "# Display image and data about it\n",
    "plt.imshow(selected_image)\n",
    "print(\"Shape: \"+str(selected_image.shape))\n",
    "print(\"Label [1 = day, 0 = night]: \" + str(selected_label))\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Shape: (600, 1100, 3)\n",
    "Label [1 = day, 0 = night]: 1\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![png](output_10_1.png)\n",
    "\n",
    "\n",
    "# 特征提取\n",
    "\n",
    "创建一个表示图像亮度的特征。我们将使用HSV颜色空间提取**平均亮度**。具体来说，我们将使用V通道（亮度测量），将V通道中的像素值相加，然后将该总和除以图像的面积，从而获得该图像的平均值。\n",
    "\n",
    "---\n",
    "###  使用V通道查找平均亮度\n",
    "\n",
    "该函数会输入 **标准化** 的RGB图像并返回一个代表图像亮度平均水平的特征（单个值）。我们将使用此值将图像分类为白天或夜晚。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Find the average Value or brightness of an image\n",
    "def avg_brightness(rgb_image):\n",
    "    # Convert image to HSV\n",
    "    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)\n",
    "\n",
    "    # Add up all the pixel values in the V channel\n",
    "    sum_brightness = np.sum(hsv[:,:,2])\n",
    "    area = 600*1100.0  # pixels\n",
    "    \n",
    "    # find the avg\n",
    "    avg = sum_brightness/area\n",
    "    \n",
    "    return avg"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Testing average brightness levels\n",
    "# Look at a number of different day and night images and think about \n",
    "# what average brightness value separates the two types of images\n",
    "\n",
    "# As an example, a \"night\" image is loaded in and its avg brightness is displayed\n",
    "image_num = 190\n",
    "test_im = STANDARDIZED_LIST[image_num][0]\n",
    "\n",
    "avg = avg_brightness(test_im)\n",
    "print('Avg brightness: ' + str(avg))\n",
    "plt.imshow(test_im)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Avg brightness: 35.2170090909\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "<matplotlib.image.AxesImage at 0x14525e518>\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![png](output_14_2.png)\n",
    "\n",
    "\n",
    "# 分类和可视化误差\n",
    "\n",
    "在本节中，我们将把我们的平均亮度特征转换为一个分类器，该分类器输入一个标准化图像并返回该图像的一个`predicted_label`。这个 `estimate_label` 函数应该返回一个值：0或1（分别代表夜晚或白天）。\n",
    "\n",
    "---\n",
    "### TODO: 建立一个完整的分类器 \n",
    "\n",
    "更新此代码，以便 i（译者注：原文不完整）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This function should take in RGB image input\n",
    "def estimate_label(rgb_image):\n",
    "    \n",
    "    # Extract average brightness feature from an RGB image \n",
    "    avg = avg_brightness(rgb_image)\n",
    "        \n",
    "    # Use the avg brightness feature to predict a label (0, 1)\n",
    "    predicted_label = 0\n",
    "    threshold = 120\n",
    "    if(avg > threshold):\n",
    "        # if the average brightness is above the threshold value, we classify it as \"day\"\n",
    "        predicted_label = 1\n",
    "    # else, the pred-cted_label can stay 0 (it is predicted to be \"night\")\n",
    "    \n",
    "    return predicted_label    \n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 测试分类器\n",
    "\n",
    "下面我们将要使用我们在notebook开始时放置的测试数据集，来测试你的分类算法！\n",
    "\n",
    "由于我们使用了非常简单的亮度特征，因此可能不希望此分类器100％准确。我们的目标是使用这一函数，使准确度达到75-85％左右。\n",
    "\n",
    "\n",
    "### 测试数据集\n",
    "\n",
    "下面，我们加载测试数据集，使用上面定义的`standardize`函数对其进行标准化，然后对其进行**混洗**；这确保了顺序不会对测试准确度测试造成影响。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "\n",
    "# Using the load_dataset function in helpers.py\n",
    "# Load test data\n",
    "TEST_IMAGE_LIST = helpers.load_dataset(image_dir_test)\n",
    "\n",
    "# Standardize the test data\n",
    "STANDARDIZED_TEST_LIST = helpers.standardize(TEST_IMAGE_LIST)\n",
    "\n",
    "# Shuffle the standardized test data\n",
    "random.shuffle(STANDARDIZED_TEST_LIST)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 确定准确度\n",
    "\n",
    "将你的分类算法（也称为“模型”）的输出与真实标签进行比较并确定准确度。\n",
    "\n",
    "此代码将所有错误分类的图像、其预测标签以及它们的真实标签存储在名为`misclassified`的列表中。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Constructs a list of misclassified images given a list of test images and their labels\n",
    "def get_misclassified_images(test_images):\n",
    "    # Track misclassified images by placing them into a list\n",
    "    misclassified_images_labels = []\n",
    "\n",
    "    # Iterate through all the test images\n",
    "    # Classify each image and compare to the true label\n",
    "    for image in test_images:\n",
    "\n",
    "        # Get true data\n",
    "        im = image[0]\n",
    "        true_label = image[1]\n",
    "\n",
    "        # Get predicted label from your classifier\n",
    "        predicted_label = estimate_label(im)\n",
    "\n",
    "        # Compare true and predicted labels \n",
    "        if(predicted_label != true_label):\n",
    "            # If these labels are not equal, the image has been misclassified\n",
    "            misclassified_images_labels.append((im, predicted_label, true_label))\n",
    "            \n",
    "    # Return the list of misclassified [image, predicted_label, true_label] values\n",
    "    return misclassified_images_labels\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Find all misclassified images in a given test set\n",
    "MISCLASSIFIED = get_misclassified_images(STANDARDIZED_TEST_LIST)\n",
    "\n",
    "# Accuracy calculations\n",
    "total = len(STANDARDIZED_TEST_LIST)\n",
    "num_correct = total - len(MISCLASSIFIED)\n",
    "accuracy = num_correct/total\n",
    "\n",
    "print('Accuracy: ' + str(accuracy))\n",
    "print(\"Number of misclassified images = \" + str(len(MISCLASSIFIED)) +' out of '+ str(total))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Accuracy: 0.86875\n",
    "Number of misclassified images = 21 out of 160"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "<a id='task9'></a>\n",
    "### 将错误分类的图像可视化\n",
    "\n",
    "将分类错误的一些图像（在`MISCLASSIFIED`列表中）可视化，并记下使其难以分类的所有特征。这样做可以帮助你发现分类算法的缺陷。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualize misclassified example(s)\n",
    "## TODO: Display an image in the `MISCLASSIFIED` list \n",
    "## TODO: Print out its predicted label - to see what the image *was* incorrectly classified as5\n",
    "num = 0\n",
    "test_mis_im = MISCLASSIFIED[num][0]\n",
    "plt.imshow(test_mis_im)\n",
    "print(str(MISCLASSIFIED[num][1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "1\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![png](output_24_1.png)\n",
    "\n",
    "---\n",
    "<a id='question2'></a>\n",
    "## （问题）：将这些错误分类可视化之后，你认为你的分类算法有哪些缺陷？\n",
    "\n",
    "**答案：** 在这里写下你的答案。\n",
    "\n",
    "# 5.改进你的算法！\n",
    "\n",
    "* （可选）调整阈值，使准确度更高。\n",
    "* （可选）添加另一个函数，用来处理你发现的缺陷！\n",
    "---"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
